Classification of Uncertain Data Using Selection Algorithm

نویسندگان

  • K. Soundararajan
  • Suresh Kumar
  • C. Anusuya
چکیده

Traditional machine learning algorithms assume that data are exact or precise. However, this assumption may not hold in some situations because of data uncertainty arising from measurement errors, data staleness, and repeated measurements etc., these kinds of uncertainty have to be handled cautiously, or else the mining results could be unreliable or even wrong. In this paper, we focus on classifying uncertain data by classification and prediction algorithm called setBase. This algorithm introduces new measures for generating, pruning and optimizing. Probability distribution function and uncertain data intervals are computed by this kind of algorithm .Based on these new measures, the optimal splitting attribute and splitting value can be identified and used for classification and prediction. Uncertainty in both numerical and categorical data can be processed by this measure. Our experimental results show that setBase has excellent performance even when data is

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Negative Selection Based Data Classification with Flexible Boundaries

One of the most important artificial immune algorithms is negative selection algorithm, which is an anomaly detection and pattern recognition technique; however, recent research has shown the successful application of this algorithm in data classification. Most of the negative selection methods consider deterministic boundaries to distinguish between self and non-self-spaces. In this paper, two...

متن کامل

‎A Bayesian mixture model‎ for classification of certain and uncertain data

‎There are different types of classification methods for classifying the certain data‎. ‎All the time the value of the variables is not certain and they may belong to the interval that is called uncertain data‎. ‎In recent years‎, ‎by assuming the distribution of the uncertain data is normal‎, ‎there are several estimation for the mean and variance of this distribution‎. ‎In this paper‎, ‎we co...

متن کامل

Feature selection using genetic algorithm for classification of schizophrenia using fMRI data

In this paper we propose a new method for classification of subjects into schizophrenia and control groups using functional magnetic resonance imaging (fMRI) data. In the preprocessing step, the number of fMRI time points is reduced using principal component analysis (PCA). Then, independent component analysis (ICA) is used for further data analysis. It estimates independent components (ICs) of...

متن کامل

An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification

The Internet provides easy access to a kind of library resources. However, classification of documents from a large amount of data is still an issue and demands time and energy to find certain documents. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...

متن کامل

SFLA Based Gene Selection Approach for Improving Cancer Classification Accuracy

 In this paper, we propose a new gene selection algorithm based on Shuffled Frog Leaping Algorithm that is called SFLA-FS. The proposed algorithm is used for improving cancer classification accuracy. Most of the biological datasets such as cancer datasets have a large number of genes and few samples. However, most of these genes are not usable in some tasks for example in cancer classification....

متن کامل

An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification

The Internet provides easy access to a kind of library resources. However, classification of documents from a large amount of data is still an issue and demands time and energy to find certain documents. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012